435 research outputs found

    SOAP3-dp: Fast, Accurate and Sensitive GPU-based Short Read Aligner

    To tackle the exponentially increasing throughput of Next-Generation Sequencing (NGS), most of the existing short-read aligners can be configured to favor speed at the expense of accuracy and sensitivity. SOAP3-dp, by leveraging the computational power of both CPU and GPU with optimized algorithms, delivers high speed and sensitivity simultaneously. Compared with widely adopted aligners including BWA, Bowtie2, SeqAlto and GEM, and GPU-based aligners including BarraCUDA and CUSHAW, SOAP3-dp is two to tens of times faster, while maintaining the highest sensitivity and lowest false discovery rate (FDR) on Illumina reads of different lengths. Unlike its predecessor SOAP3, which does not allow gapped alignment, SOAP3-dp by default tolerates alignment similarity as low as 60 percent. Real-data evaluation using the human genome demonstrates SOAP3-dp's power to enable more authentic variants and longer indels to be discovered. Fosmid sequencing shows a 9.1 percent FDR on newly discovered deletions. SOAP3-dp natively supports the BAM file format and uses the same scoring scheme as BWA, which enables it to be integrated into existing analysis pipelines. SOAP3-dp has been deployed on Amazon-EC2, NIH-Biowulf and Tianhe-1A. Comment: 21 pages, 6 figures, submitted to PLoS ONE, additional files available at "https://www.dropbox.com/sh/bhclhxpoiubh371/O5CO_CkXQE". Comments most welcome.

    Superconductivity in a new layered cobalt oxychalcogenide Na$_{6}$Co$_{3}$Se$_{6}$O$_{3}$ with a 3$d^{5}$ triangular lattice

    Unconventional superconductivity in bulk materials under ambient pressure is extremely rare among the 3$d$ transition-metal compounds outside the layered cuprates and iron-based family. It is predominantly linked to highly anisotropic electronic properties and quasi-two-dimensional (2D) Fermi surfaces. To date, the only known example of a Co-based exotic superconductor was the hydrated layered cobaltate, Na$_{x}$CoO$_{2}\cdot y$H$_{2}$O, whose superconductivity is realized in the vicinity of a spin-1/2 Mott state. However, the nature of the superconductivity in these materials is still an active subject of debate, and therefore finding a new class of superconductors will help unravel the mysteries of their unconventional superconductivity. Here we report the discovery of unconventional superconductivity at $\sim$6.3 K in our newly synthesized layered compound Na$_{6}$Co$_{3}$Se$_{6}$O$_{3}$, in which the edge-shared CoSe$_{6}$ octahedra form [CoSe$_{2}$] layers with a perfect triangular lattice of Co ions. It is the first 3$d$ transition-metal oxychalcogenide superconductor with distinct structural and chemical characteristics. Despite its relatively low $T_{c}$, the material exhibits extremely high superconducting upper critical fields, $\mu_{0}H_{c2}(0)$, which exceed the Pauli paramagnetic limit by a factor of 3-4. First-principles calculations show that Na$_{6}$Co$_{3}$Se$_{6}$O$_{3}$ is a rare example of a negative charge transfer superconductor. This new cobalt oxychalcogenide, with geometrical frustration among the Co spins, shows great potential as a highly appealing candidate for the realization of high-$T_{c}$ and/or unconventional superconductivity beyond the well-established Cu- and Fe-based superconductor families, and opens a new field in the physics and chemistry of low-dimensional superconductors.
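
To put the upper-critical-field claim in numbers, the sketch below applies the standard weak-coupling BCS estimate of the Pauli paramagnetic limit, $\mu_{0}H_{P} \approx 1.84\,\mathrm{T/K} \times T_{c}$. The abstract does not state which estimate the authors used, so the coefficient and the function name `pauli_limit_tesla` are assumptions made here for illustration; only $T_{c} \approx 6.3$ K and the factor of 3-4 come from the abstract.

```python
# Rough estimate only: the 1.84 T/K coefficient is the weak-coupling BCS
# (Clogston-Chandrasekhar) value and is an assumption, not from the abstract.
PAULI_COEFF_T_PER_K = 1.84


def pauli_limit_tesla(tc_kelvin: float) -> float:
    """Return the weak-coupling Pauli paramagnetic limit in tesla."""
    return PAULI_COEFF_T_PER_K * tc_kelvin


tc = 6.3  # K, from the abstract
h_pauli = pauli_limit_tesla(tc)

# The abstract reports an upper critical field 3-4 times the Pauli limit.
h_c2_low = 3 * h_pauli
h_c2_high = 4 * h_pauli

print(f"Pauli limit: {h_pauli:.1f} T")
print(f"Implied upper critical field: {h_c2_low:.1f} to {h_c2_high:.1f} T")
```

Under that assumption the Pauli limit is roughly 11.6 T, so a factor of 3-4 corresponds to an upper critical field of about 35 to 46 T.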

    MICA: A fast short-read aligner that takes full advantage of Many Integrated Core Architecture (MIC)

    Background: Short-read aligners have recently gained a lot of speed by exploiting the massive parallelism of GPUs. An emerging alternative to the GPU is Intel MIC; the Tianhe-2 supercomputer, currently top of the TOP500, is built with 48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be parallelized easily; however, the performance is often inferior to GPU counterparts, as an MIC card contains only ~60 cores (while a GPU card typically has over a thousand cores). Results: To better utilize MIC-enabled computers for NGS data analysis, we developed a new short-read aligner, MICA, that is optimized in view of MIC's limitations and the extra parallelism inside each MIC core. By utilizing the 512-bit vector units in the MIC and implementing a new seeding strategy, experiments on aligning 150 bp paired-end reads show that MICA using one MIC card is 4.9 times faster than BWA-MEM (using 6 cores of a top-end CPU), and slightly faster than SOAP3-dp (using a GPU). Furthermore, MICA's simplicity allows very efficient scale-up when multiple MIC cards are used in a node (3 cards give a 14.1-fold speedup over BWA-MEM). Summary: MICA can be readily used by MIC-enabled supercomputers for production purposes. We have tested MICA on Tianhe-2 with 90 WGS samples (17.47 Tera-bases), which can be aligned in an hour using 400 nodes. MICA has impressive performance even though MIC is only in its initial stage of development. Availability and implementation: MICA's source code is freely available at http://sourceforge.net/projects/mica-aligner under GPL v3. Supplementary information: Supplementary information is available as "Additional File 1". Datasets are available at www.bio8.cs.hku.hk/dataset/mica.
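
The scale-up and throughput figures quoted in the MICA abstract can be checked with a little arithmetic. The sketch below is only an illustration: the function name `scaling_efficiency` is hypothetical, "in an hour" is treated as exactly one hour, and the even split of work across nodes is an assumption.

```python
def scaling_efficiency(speedup_one_card: float, speedup_n_cards: float, n_cards: int) -> float:
    """Fraction of ideal linear scaling achieved when going from 1 to n cards."""
    return speedup_n_cards / (speedup_one_card * n_cards)


# Speedups over BWA-MEM (6 CPU cores), as quoted in the abstract.
efficiency = scaling_efficiency(speedup_one_card=4.9, speedup_n_cards=14.1, n_cards=3)

# Tianhe-2 test: 17.47 Tera-bases on 400 nodes in about one hour.
total_bases = 17.47e12
nodes = 400
hours = 1.0  # assumption: "in an hour" taken as exactly 1 h
bases_per_node_hour = total_bases / (nodes * hours)

print(f"3-card scaling efficiency: {efficiency:.1%}")
print(f"Throughput per node: {bases_per_node_hour / 1e9:.1f} Gbases/hour")
```

With these inputs, three cards reach about 96% of ideal linear scaling, and each node processes roughly 44 Gbases per hour.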

    Full-length single-cell RNA-seq applied to a viral human cancer: applications to HPV expression and splicing analysis in HeLa S3 cells

    Background: Viral infection causes multiple forms of human cancer, and HPV infection is the primary factor in cervical carcinomas. Recent single-cell RNA-seq studies highlight the tumor heterogeneity present in most cancers, but virally induced tumors have not been studied. HeLa is a well-characterized HPV+ cervical cancer cell line. Results: We developed a new high-throughput platform to prepare single-cell RNA on a nanoliter scale based on a customized microwell chip. Using this method, we successfully amplified full-length transcripts of 669 single HeLa S3 cells, and 40 of them were randomly selected for single-cell RNA sequencing. Based on these data, we obtained a comprehensive understanding of the heterogeneity of HeLa S3 cells in gene expression, alternative splicing and fusions. Furthermore, we identified a high diversity of HPV-18 expression and splicing at the single-cell level. By co-expression analysis we identified 283 E6, E7 co-regulated genes, including CDC25, PCNA, PLK4, BUB1B and IRF1, which are known to interact with HPV viral proteins. Conclusion: Our results reveal the heterogeneity of a virus-infected cell line. They not only provide a transcriptome characterization of HeLa S3 cells at the single-cell level, but also demonstrate the power of single-cell RNA-seq analysis of virally infected cells and cancers.

    Whole exome sequencing identifies frequent somatic mutations in cell-cell adhesion genes in Chinese patients with lung squamous cell carcinoma

    Lung squamous cell carcinoma (SQCC) accounts for about 30% of all lung cancer cases. Understanding of the mutational landscape for this subtype of lung cancer in Chinese patients is currently limited. We performed whole exome sequencing on samples from 100 patients with lung SQCCs to search for somatic mutations, followed by target capture sequencing in another 98 samples for validation. We identified 20 significantly mutated genes, including TP53, CDH10, NFE2L2 and PTEN. Pathways with frequently mutated genes included those of cell-cell adhesion/Wnt/Hippo in 76%, oxidative stress response in 21%, and phosphatidylinositol-3-OH kinase in 36% of the tested tumor samples. Mutations of chromatin regulatory factor genes were identified at a lower frequency. In functional assays, we observed that knockdown of CDH10 promoted cell proliferation, soft-agar colony formation, cell migration and cell invasion, and overexpression of CDH10 inhibited cell proliferation. This mutational landscape of lung SQCC in Chinese patients improves our current understanding of lung carcinogenesis, early diagnosis and personalized therapy.

    The diploid genome sequence of an Asian individual

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.